
feat: Built-in agent — LLM-powered AEO analyst with chat API #74

Closed

arberx wants to merge 17 commits into main from feat/agent

Conversation

@arberx (Member) commented Mar 16, 2026

Summary

  • Built-in LLM-powered agent with chat API, thread persistence in SQLite, and 9 domain-specific AEO tools (visibility runs, timeline analysis, evidence lookup, competitor analysis, keyword management, sitemap inspection, etc.)
  • CLI command (canonry agent) for interactive terminal chat sessions
  • Multi-provider support — agent threads can use OpenAI, Claude, or Gemini as the backing LLM
  • Security hardening — project ownership verification on thread endpoints, SSRF protection with DNS resolution on sitemap inspection, project-scoped run access
  • Bug fixes — history windowing (newest messages kept, not oldest), complete evidence across all runs, malformed JSON recovery for Claude message replay

Commits

  1. feat: built-in agent — LLM-powered AEO analyst with chat API — core agent loop, tools, store, routes, CLI command, DB migrations
  2. fix(security): Add project ownership verification to thread endpoints
  3. fix(agent): Add error handling for malformed JSON in tool call arguments
  4. perf(agent): Move dynamic imports to top-level
  5. refactor(agent): Replace circular HTTP self-calls with direct service layer
  6. style(agent): Remove dead code and unused types
  7. fix(agent): fix 5 bugs in agent loop, SSRF validation, and services — history windowing, SSRF validation, project-scoped getRun, complete getHistory, Claude replay fix

Test plan

  • CI passes (typecheck, lint, test)
  • SSRF validation tests cover IPv6 loopback, link-local, ULA, localhost, DNS resolution
  • Manual: canonry agent interactive session against a project with runs
  • Manual: verify thread persistence across server restarts

🤖 Generated with Claude Code

@arberx (Member, Author) left a comment:

🤖 Automated Review Summary

Files reviewed: 21
Comments left: 13

Issues found:

  • 🔴 Bug: 5
  • 🟡 Security: 2
  • 🟠 Performance: 1
  • 🔵 Type Safety: 2
  • 🟣 Testing: 1
  • ⚪ Style: 1
  • ⚪ Dead code: 1

Key findings

Bugs (fix before merge):

  1. Tool-call persistence ordering (loop.ts ~line 188) — assistant tool-call rows are stored after tool.execute() runs. If execution throws, the DB ends up with a tool result row but no matching assistant row, corrupting thread replay for Claude.
  2. Empty apiKey fallback (server.ts ~line 499) — apiKey ?? '' means a misconfigured provider silently constructs a working handler that returns 401 on every LLM call instead of returning undefined to disable the agent.
  3. ApiClient with undefined apiUrl/apiKey (server.ts ~line 503) — self-hosted instances without apiUrl set will get silent failures on run_sweep and all GSC tools.
  4. Orphaned messages on thread delete (agent.ts ~line 220) — ON DELETE CASCADE only fires if PRAGMA foreign_keys = ON, which is not guaranteed.
  5. Unbounded message field (agent.ts ~line 158) — no maxLength on the message body; a single large payload can rack up LLM token costs.
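The persistence-ordering fix called for in item 1 reduces to writing the assistant row before the tool runs, and recording failures as tool results. A minimal sketch, with hypothetical Store/Tool interfaces standing in for the PR's actual types:

```typescript
// Hypothetical stand-ins for the PR's store and tool shapes.
interface Store {
  append(row: { role: string; content: string; toolCallId?: string }): void
}
interface Tool {
  execute(args: unknown): Promise<string>
}

async function runToolCall(
  store: Store,
  tool: Tool,
  call: { id: string; args: unknown },
): Promise<string> {
  // 1. Persist the assistant tool-call row FIRST, so a throw in execute()
  //    can never leave a tool-result row without its matching assistant row.
  store.append({
    role: 'assistant',
    content: JSON.stringify(call.args),
    toolCallId: call.id,
  })
  let result: string
  try {
    result = await tool.execute(call.args)
  } catch (err) {
    // 2. Persist the failure as a tool result so thread replay stays well-formed.
    result = `Tool error: ${err instanceof Error ? err.message : String(err)}`
  }
  store.append({ role: 'tool', content: result, toolCallId: call.id })
  return result
}
```

With this ordering, a crash mid-execution leaves a matched assistant/tool pair (the tool row carrying the error), never an orphaned result.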

Security:

  • dns.resolve6 catch-all silently swallows errors (sitemap-parser.ts) — not exploitable but could incorrectly block IPv6-only hosts.
  • Missing message length limit (covered above under Bugs).

Testing gap:

  • ~830 lines of new agent code ship with zero unit tests. The loop's history-windowing, maxSteps fallback, and JSON recovery paths are all critical and unverified.

Performance:

  • getTimeline has an N+1 query pattern; getHistory already demonstrates the correct bulk-fetch approach.
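The bulk-fetch shape looks roughly like this; `fetchSnapshotsByRunIds` is a hypothetical stand-in for a single `WHERE run_id IN (...)` query (e.g. Drizzle's `inArray()`), and the Snapshot fields are illustrative:

```typescript
// Illustrative snapshot row shape.
interface Snapshot {
  runId: string
  keyword: string
  cited: boolean
}

async function buildTimeline(
  runIds: string[],
  // Stand-in for one bulk query, e.g. db.select().where(inArray(col, ids)).
  fetchSnapshotsByRunIds: (ids: string[]) => Promise<Snapshot[]>,
) {
  const all = await fetchSnapshotsByRunIds(runIds) // 1 query, not N
  // Group in memory instead of issuing a per-run query.
  const byRun = new Map<string, Snapshot[]>()
  for (const id of runIds) byRun.set(id, [])
  for (const snap of all) byRun.get(snap.runId)?.push(snap)
  return runIds.map(id => ({ runId: id, snapshots: byRun.get(id) ?? [] }))
}
```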

What's done well ✅

  • The SSRF hardening in sitemap-parser.ts is thorough: DNS resolution, IPv4/IPv6 loopback, link-local, ULA, and IPv4-mapped IPv6 addresses are all covered, with tests.
  • History windowing (newest-N-ascending subquery) is the right approach and well-commented.
  • Provider abstraction is clean — adding a new LLM is a one-function addition in llm.ts.
  • Malformed JSON recovery in convertToClaudeMessages is a good defensive touch.
  • DB schema with ON DELETE CASCADE + composite index on (thread_id, created_at) is solid.

Overall assessment: NEEDS_WORK on the tool-call persistence bug and the ApiClient/apiKey issues before this is safe to merge.

This review was generated by an AI agent. Please verify all suggestions.

arberx added a commit that referenced this pull request Mar 17, 2026
- Fix tool-call persistence ordering: persist assistant row before
  tool.execute() so DB is never left with orphaned tool results
- Guard against empty apiKey: return undefined instead of silently
  constructing a broken handler
- Fall back to localhost:{port} when apiUrl is not configured so
  self-hosted instances can use HTTP-backed agent tools
- Explicitly delete agent_messages before thread deletion (don't
  rely on PRAGMA foreign_keys = ON)
- Add maxLength: 8000 on message body schema (Fastify/Ajv enforcement)
- Fix N+1 in getTimeline: bulk-fetch all snapshots with inArray
- Remove dead claude entry from PROVIDER_ENDPOINTS (uses dedicated path)
- Clean up duplicate projects import alias in server.ts
- Narrow dns.resolve6 catch to ENODATA/ENOTFOUND only
- Add CHECK constraint on agent_messages.role column

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
arberx and others added 12 commits March 17, 2026 14:02
Add a built-in AI agent that uses canonry's own tools to answer
AEO questions, run sweeps, and explain citation changes. No external
agent framework required — just the LLM provider already configured.

Architecture:
- Agent loop modeled after OpenClaw's pattern (LLM ↔ tool ↔ repeat)
- Uses existing provider API keys from canonry config
- Persistence in SQLite (same database, new tables)
- Provider priority: Claude > OpenAI > Gemini (configurable)

New files:
- packages/canonry/src/agent/ — core agent module
  - loop.ts: LLM ↔ tool execution cycle
  - llm.ts: provider-agnostic LLM layer (OpenAI, Claude, Gemini)
  - tools.ts: canonry operations as LLM-callable functions
  - store.ts: thread/message persistence (SQLite)
  - prompt.ts: AEO analyst system prompt
  - types.ts: shared type definitions
- packages/api-routes/src/agent.ts — REST API for chat
- packages/canonry/src/commands/agent.ts — CLI commands

CLI:
  canonry agent ask <project> "message"    — chat with the agent
  canonry agent threads <project>           — list threads
  canonry agent thread <project> <id>       — show thread history

API:
  POST   /api/v1/projects/:project/agent/threads              — create thread
  GET    /api/v1/projects/:project/agent/threads              — list threads
  GET    /api/v1/projects/:project/agent/threads/:id          — get thread + messages
  POST   /api/v1/projects/:project/agent/threads/:id/messages — send message
  DELETE /api/v1/projects/:project/agent/threads/:id          — delete thread

Config:
  agent:
    provider: claude|openai|gemini  (optional, auto-detects)
    model: string                   (optional, uses provider default)
    maxSteps: number                (default: 10)
    maxHistory: number              (default: 30)
    enabled: boolean                (default: true if provider available)

Tools exposed to agent:
  - get_status, run_sweep, get_evidence, get_timeline
  - list_keywords, list_competitors, get_run_details
  - get_gsc_performance, get_gsc_coverage, inspect_url

DB migration:
  - agent_threads: conversation threads per project
  - agent_messages: messages within threads (user/assistant/tool)

Closes #59
Fixes IDOR vulnerability where thread endpoints (get, send message, delete)
accepted a :project param but never verified the thread belonged to that project.

Now all three endpoints verify thread.projectId === project.id before allowing access.

Addresses review comment #1 (Security - CRITICAL)
Wrap JSON.parse(toolCall.function.arguments) in try-catch to prevent crashes
when LLMs return malformed JSON. On parse error, persist the error as a tool
result and continue the agent loop instead of crashing.

Addresses review comment #2 (Bug)
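A minimal sketch of that recovery path; the helper name is illustrative, not the PR's actual code:

```typescript
// Parse failures become data instead of exceptions, so the agent loop can
// persist the error as a tool result and keep going.
function safeParseArgs(
  raw: string,
): { ok: true; args: unknown } | { ok: false; error: string } {
  try {
    return { ok: true, args: JSON.parse(raw) }
  } catch (err) {
    return { ok: false, error: `Malformed tool arguments: ${(err as Error).message}` }
  }
}
```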
Replace dynamic imports of 'eq' and 'projects' table inside the message handler
with static top-level imports to eliminate async overhead on every message.

Addresses review comment #3 (Performance)
… layer

Create AgentServices class that provides direct DB access for agent tools,
eliminating the circular dependency where tools called the server's own HTTP API.

Most read-only tools (get_status, get_evidence, get_timeline, list_keywords,
list_competitors, get_run_details) now use direct DB calls via AgentServices.

Write operations (run_sweep) and external integrations (GSC) still use HTTP
for proper job orchestration and auth handling.

Benefits:
- Eliminates ~1-5ms HTTP localhost roundtrip per tool call
- Removes startup timing dependency
- Simplifies auth config

Addresses review comment #4 (Architecture)
- Remove redundant if/else in llm.ts that set the same Authorization header
- Remove unused ToolDefinition interface (actual interface is AgentTool)

Addresses review comments #5 and #6 (Style)
- P1: History windowing now returns newest N messages (was oldest N,
  causing long threads to drop the user's latest prompt)
- P1: SSRF validation now blocks localhost, IPv6 loopback/private,
  and resolves hostnames to verify they don't point to internal IPs
- P2: getRun() now requires projectName to prevent cross-project
  data access via known run IDs
- P2: getHistory() now queries snapshots for all returned runs
  (was only querying the first run ID)
- P2: convertToClaudeMessages() now handles malformed JSON in
  historical tool calls instead of crashing the thread

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
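The P1 windowing fix amounts to keeping the tail of the message list while preserving chronological order. A generic in-memory sketch (the PR itself does this with a newest-N-ascending SQL subquery, not application code):

```typescript
// Keep the newest maxHistory messages; order inside the window stays ascending,
// so the user's latest prompt is never dropped from long threads.
function windowHistory<T>(messages: T[], maxHistory: number): T[] {
  return messages.slice(Math.max(0, messages.length - maxHistory))
}
```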
- Fix tool-call persistence ordering: persist assistant row before
  tool.execute() so DB is never left with orphaned tool results
- Guard against empty apiKey: return undefined instead of silently
  constructing a broken handler
- Fall back to localhost:{port} when apiUrl is not configured so
  self-hosted instances can use HTTP-backed agent tools
- Explicitly delete agent_messages before thread deletion (don't
  rely on PRAGMA foreign_keys = ON)
- Add maxLength: 8000 on message body schema (Fastify/Ajv enforcement)
- Fix N+1 in getTimeline: bulk-fetch all snapshots with inArray
- Remove dead claude entry from PROVIDER_ENDPOINTS (uses dedicated path)
- Clean up duplicate projects import alias in server.ts
- Narrow dns.resolve6 catch to ENODATA/ENOTFOUND only
- Add CHECK constraint on agent_messages.role column

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Critical fixes:
- Fix history truncation splitting tool-call pairs: trim orphaned
  tool/assistant messages at the window boundary
- Add per-thread concurrency guard (409 Conflict if thread is busy)
- Fix get_status returning oldest 3 runs (slice(-3) → slice(0,3))
- Resolve LLM config from registry at call time instead of capturing
  stale API key at startup
- Merge consecutive Claude tool results into single user message to
  avoid invalid same-role sequences

Important fixes:
- Add 20KB truncation cap on tool results to prevent blowing up
  LLM context window
- Guard against empty toolCalls array causing silent spin
- Add 90s timeout on all LLM fetch calls
- Return structured error responses (502 for LLM errors) instead of
  generic 500s
- Fix inconsistent return shape in getHistory (evidence → snapshots)
- Add maxLength/enum validation on thread title and channel fields

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
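The truncation cap might look like this; only the 20KB figure comes from the commit, the marker text and character-based counting are illustrative:

```typescript
// Cap oversized tool results before they enter the LLM context window.
function capToolResult(result: string, maxChars = 20_000): string {
  if (result.length <= maxChars) return result
  const dropped = result.length - maxChars
  return result.slice(0, maxChars) + `\n…[truncated ${dropped} chars]`
}
```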
- Rename the agent to "Aero" across CLI output and error messages
- Add soul.md as the agent's identity/personality definition (checked
  into repo as the default, loaded from ~/.canonry/soul.md at runtime
  if the user wants to customize)
- Add memory.md as persistent context that Aero accumulates — loaded
  from ~/.canonry/memory.md at runtime so users can prime the agent
  with project-specific knowledge
- System prompt now composes: soul + project context + tools + memory
- Built-in soul is embedded in prompt.ts so it works after tsup bundling
- Agent remains fully optional: no background processes, only activates
  on explicit user request via CLI or API

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Users can now choose which LLM provider Aero uses per message:

- CLI: canonry agent ask <project> "msg" --provider claude
- API: POST /agent/threads/:id/messages { message, provider: "gemini" }

The provider field is optional — omitting it uses the default
(configured in agent.provider or auto-detected: claude > openai > gemini).
If the requested provider isn't configured, returns a clear error.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds the /aero route with a full chat interface for interacting with the
built-in Aero agent. Includes project selector, provider/model selector,
thread management (create/delete), message display with optimistic
rendering, and a thinking animation during API calls.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
arberx and others added 5 commits March 17, 2026 15:12
… background processing

Major Aero agent improvements:

- Memory: get_memory/save_memory tools with pre-seeded domain knowledge
  (citation states, provider grounding mechanics, regression detection)
- Startup sequence: auto-gathers context on new threads, responds naturally
- System tools (opt-in): run_command, read_file, write_file, list_files,
  http_request — gated behind agent.systemTools config flag
- Write tools: add/remove keywords, add/remove competitors, update_project
- Background processing: send-message returns 202, UI polls for completion,
  agent work survives page navigation
- Chat UI: markdown rendering, auto-expanding textarea, inline thread rename,
  relative dates, cleaner thread list, no page scroll
- Claude API fix: bidirectional tool_use/tool_result validation prevents
  orphaned blocks from corrupting conversation history
- CLI polling: agent ask now polls thread status instead of blocking
- Remove unused footer, PATCH endpoint for thread rename, auto-titling

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…quest size logging

- Reduce default maxHistoryMessages from 30 to 20 (fewer stale messages)
- Compress tool results older than 8 rows to 500 chars to prevent large
  get_evidence/get_memory results from inflating every subsequent request
- Add stderr logging per request: ~N tokens (M chars, K messages) for debugging
- Version 1.17.0 → 1.18.0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…construction

The previous two-pass validation approach had edge cases where the passes
interacted in ways that still left orphaned tool_result blocks (causing
Claude 400 errors at messages.0.content.0).

New approach: state machine that walks the OpenAI-format messages once and
only emits a tool call group (assistant+tool_use → user+tool_result) when
ALL tool_use blocks have matching tool_result blocks. Incomplete groups from
truncated history or server crashes are dropped entirely. Consecutive
same-role messages are merged at the end.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
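A compressed sketch of that single-pass state machine; the message shapes (toolCalls/toolCallId) are illustrative OpenAI-format stand-ins, not the PR's actual types:

```typescript
interface Msg {
  role: 'user' | 'assistant' | 'tool'
  content: string
  toolCalls?: { id: string }[] // assistant messages that invoke tools
  toolCallId?: string          // tool-result messages
}

function rebuildHistory(history: Msg[]): Msg[] {
  const out: Msg[] = []
  let i = 0
  while (i < history.length) {
    const msg = history[i]
    if (msg.role === 'assistant' && msg.toolCalls?.length) {
      // Collect the tool results that immediately follow this assistant turn.
      let j = i + 1
      const results: Msg[] = []
      while (j < history.length && history[j].role === 'tool') {
        results.push(history[j])
        j++
      }
      const resultIds = new Set(results.map(r => r.toolCallId))
      // Emit the group only when EVERY tool_use has a matching tool_result;
      // incomplete groups (truncated history, crashes) are dropped entirely.
      if (msg.toolCalls.every(c => resultIds.has(c.id))) out.push(msg, ...results)
      i = j
    } else if (msg.role === 'tool') {
      i++ // orphaned tool result with no preceding tool call: drop
    } else {
      out.push(msg)
      i++
    }
  }
  // Merge consecutive same-role plain messages so the Claude API never sees
  // an invalid same-role sequence.
  const plain = (x: Msg) => !x.toolCalls && !x.toolCallId
  return out.reduce<Msg[]>((acc, m) => {
    const prev = acc[acc.length - 1]
    if (prev && prev.role === m.role && plain(prev) && plain(m)) {
      prev.content += '\n' + m.content
    } else {
      acc.push({ ...m })
    }
    return acc
  }, [])
}
```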
Users can now pick a specific model (e.g. Sonnet vs Opus) from the chat
UI when a provider is selected. This avoids rate limit issues when the
provider-level config is set to a model with low rate limits.

Model priority: request model > agent config > provider config > default.
Also syncs DEFAULT_MODELS in llm.ts with MODEL_REGISTRY from contracts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolve conflicts: keep both indexing API (from main) and agent
features (from feat/agent). Version stays at 1.19.0.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
arberx commented Mar 18, 2026

Unsure this is necessary in Canonry at the moment.

The real power of canonry is leveraged through an "openclaw"-type agent that has full access to the host operating system and a well-constructed memory.

This agent UI feels like any other agent UI. Not a fan so far.

@arberx arberx marked this pull request as draft March 18, 2026 00:00
process.stdout.write('Aero is thinking...')
}

await client.sendAgentMessage(project, threadId, message, opts?.provider)
Missing error handling on sendAgentMessage: If this call throws (e.g. network error, 4xx/5xx), the exception propagates uncaught and the polling loop below never runs — acceptable for a hard failure. However, if the server accepts the message but never transitions out of 'processing', the loop silently times out and returns an empty string (see line 62–80). A tighter pattern: await this call, then check the thread status before entering the poll loop.

if (opts?.format !== 'json') process.stdout.write('.')
}

if (opts?.format !== 'json') console.log('\n')

Silent timeout — no error signal when the agent takes >3 minutes: When i reaches 120 (120 × 1500ms = 3 min), the loop exits without setting process.exitCode or printing an error. The caller receives an empty response string and exit code 0, which looks like success.

// After the loop:
if (!response) {
  console.error('\nTimed out waiting for agent response (3 min).')
  process.exitCode = 1
  return
}

await validateSitemapUrl(url)
}

const res = await fetch(url)

DNS rebinding / TOCTOU: The DNS check in validateSitemapUrl (lines 66–88) happens before fetch(url). A malicious DNS server can return a public IP during validation, then switch to a private IP for the actual fetch request — this is a classic DNS rebinding attack.

Full mitigation requires resolving the hostname to an IP, asserting it's public, then connecting to that specific IP directly (e.g. by passing a custom agent to fetch that pins the resolved IP). The current approach significantly raises the bar vs. the old static regex check, but it is not bulletproof. Worth adding a comment documenting this known limitation so it's not mistaken for a complete fix.
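One way to pin the resolved IP using Node's built-in https module and its `lookup` option. This is a sketch under stated assumptions: the private-range predicate here is deliberately minimal and narrower than the real validateSitemapUrl, and `fetchPinned` is a hypothetical helper, not the PR's code:

```typescript
import dns from 'node:dns/promises'
import https from 'node:https'
import net from 'node:net'

// Minimal private-range check for illustration only — the real validator
// covers more ranges (full RFC 1918, IPv4-mapped IPv6, etc.).
function isPrivateAddress(ip: string): boolean {
  return /^(127\.|10\.|192\.168\.|169\.254\.|::1$|f[cd])/i.test(ip)
}

// Resolve once, validate, then pin that exact IP for the socket via the
// `lookup` option, so a second DNS answer cannot redirect the connection.
async function fetchPinned(url: string): Promise<string> {
  const { hostname } = new URL(url)
  const ip = net.isIP(hostname) ? hostname : (await dns.lookup(hostname)).address
  if (isPrivateAddress(ip)) throw new Error(`Blocked private address: ${ip}`)
  return new Promise((resolve, reject) => {
    const req = https.get(
      url,
      {
        lookup: (_host, _opts, cb) =>
          (cb as (e: null, a: string, f: number) => void)(null, ip, net.isIP(ip)),
      },
      res => {
        let body = ''
        res.setEncoding('utf8')
        res.on('data', chunk => (body += chunk))
        res.on('end', () => resolve(body))
      },
    )
    req.on('error', reject)
  })
}
```

Note that pinning by IP keeps TLS certificate validation intact because the original hostname is still used for SNI and cert checks; only the socket's address resolution is fixed.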

const services = new AgentServices(db)

// ApiClient is only needed for HTTP-backed tools (run_sweep, GSC).
// If apiUrl/apiKey aren't set (self-hosted), those tools will gracefully error.

Security: systemTools defaults to false — good. Consider adding a prominent warning when it's enabled.

When agent.systemTools: true, the built-in agent gains shell execution, file I/O, and HTTP request capabilities. Since the agent operates on user-supplied message content, an adversarial input could abuse these tools. At minimum, log a startup warning when systemTools is true so operators notice, and document clearly in the config schema that this is a dangerous option.

@arberx (Member, Author) left a comment:

🤖 Automated Review — PR #74 (incremental: new commits since last review)

Summary: Solid agent foundation. The LLM loop, tool system, and CLI commands are well-structured. A few issues flagged inline:

Severity | Finding | File
🟠 Bug | Silent timeout — loop exits with exitCode=0 + empty response after 3 min | commands/agent.ts:83
🟠 Bug | --wait polls indexingState (wrong field — means allowed, not indexed) | commands/google.ts (already in main)
🔴 Security | DNS rebinding TOCTOU in validateSitemapUrl | sitemap-parser.ts:110
🟡 Robustness | sendAgentMessage failure not linked to poll-loop lifecycle | commands/agent.ts:58
🟡 Security | systemTools: true silently grants shell/file/network to agent | server.ts:509

The --wait / INDEXING_ALLOWED bug is already on main — worth a hotfix or follow-up issue independent of this PR.

arberx added a commit that referenced this pull request Apr 18, 2026
Stack 2 pattern-prover: ports get_status, get_health, and get_timeline
from PR #74's shape to pi-agent-core's AgentTool with @sinclair/typebox
schemas. Locks the pattern before batching the remaining tools.

Tools consume the existing ApiClient directly — no AgentServices shim.
Aero uses the same API surface as any external agent, keeping the
agent-first contract.

Project name is bound via the ToolContext closure, not an LLM-visible
argument — prevents the model from targeting the wrong project.

- packages/canonry/src/agent/tools.ts — ToolContext, buildReadTools
- packages/canonry/test/agent-tools.test.ts — 6 tests covering tool
  construction, default params, override params, and filter behavior
- Bump to 2.0.2
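The closure-binding idea can be illustrated like this; ToolContext and AgentTool are simplified stand-ins for the real pi-agent-core shapes, and fetchStatus is a hypothetical ApiClient call:

```typescript
// Simplified stand-ins for the pi-agent-core tool shapes.
interface ToolContext {
  projectName: string
  fetchStatus: (project: string) => Promise<string> // e.g. an ApiClient call
}

interface AgentTool {
  name: string
  parameters: Record<string, unknown> // JSON schema the LLM sees
  execute: (args: Record<string, unknown>) => Promise<string>
}

function buildGetStatusTool(ctx: ToolContext): AgentTool {
  return {
    name: 'get_status',
    // No `project` property here — the model cannot supply or override it.
    parameters: { type: 'object', properties: {} },
    // The project is captured from ToolContext at build time, via closure.
    execute: async () => ctx.fetchStatus(ctx.projectName),
  }
}
```

Because the schema exposes no project argument, even a prompt-injected "analyze project X instead" cannot retarget the tool.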
arberx added a commit that referenced this pull request Apr 18, 2026
…ords, list_competitors, get_run)

Stack 4: brings the Aero read surface from 3 tools to 7.

- get_insights — intelligence engine output (regressions/gains/opportunities
  with cause + recommendation metadata). Agents should query this instead
  of re-deriving conclusions from raw timeline rows.
- list_keywords / list_competitors — tracking scope.
- get_run — drill into a specific run by id. Particularly useful after
  get_status surfaces a failed run.

Dropped get_evidence from the original PR #74 list — canonry's evidence
command is just getTimeline() with a "cited" boolean convenience, so it
would be redundant against the existing get_timeline tool.

- Bump to 2.2.0
@arberx arberx closed this Apr 18, 2026
arberx added a commit that referenced this pull request Apr 18, 2026
…shboard bar (#332)

* chore(agent): remove OpenClaw gateway and bundled runtime

Strips the OpenClaw-backed agent runtime ahead of the native in-process
loop. Keeps the external-agent webhook contract (`canonry agent attach
<project> --url <url>` / `agent detach`) so existing subscribers keep
working; drops the setup/install/lifecycle surface entirely.

BREAKING CHANGE: the following CLI commands are removed — `canonry agent
setup`, `canonry agent start`, `canonry agent stop`, `canonry agent
status`, `canonry agent reset`. `canonry agent attach` now requires
`--url <webhook-url>` instead of deriving the URL from the former
`config.agent.gatewayPort`. The `config.agent.{binary,profile,autoStart,
gatewayPort}` fields are removed; only `config.agent.mode` remains
(reserved until the native loop ships). Users with an orphaned
`~/.openclaw-aero/` directory get a one-time boot-time warning.

- Deletes agent-bootstrap.ts, agent-manager.ts, and their tests
  (agent-bootstrap.test, agent-manager.test, agent-config.test,
  agent-commands.test, agent-webhook.test).
- Trims agent-webhook.ts to just the webhook event list consumed by
  `agent attach`.
- Updates server.ts to drop AgentManager construction, auto-attach
  webhook hook, and graceful shutdown of the gateway.
- Rewrites commands/agent.ts and cli-commands/agent.ts to expose only
  attach/detach; adds `--url` to attach.
- Drops the OpenClaw integration test script.
- Keeps the aero skill — it's target-agnostic (Claude Code, Codex,
  pi-agent-core all read the same skill content). Rewrites its
  memory-patterns reference to match the "canonry is source of truth,
  query don't duplicate" model.
- Refreshes AGENTS.md files, README.md, and the canonry-setup CLI
  reference to reflect the new surface.
- Major version bump: 1.48.4 → 2.0.0.

* feat(agent): scaffold pi-agent-core integration

Adds @mariozechner/pi-agent-core, @mariozechner/pi-ai, and
@sinclair/typebox as direct deps on packages/canonry. Introduces
packages/canonry/src/agent/pi-runtime.ts — a thin factory that
constructs a pi-agent-core Agent scoped to a canonry project.

Stack 1 of the native agent loop: proves the dep graph and import
surface. Upcoming stacks wire convertToLlm, transformContext,
event-driven persistence, tool definitions, and the beforeToolCall
policy gate.

- Bump to 2.0.1.

* feat(agent): port 3 read tools to pi-agent-core shape

Stack 2 pattern-prover: ports get_status, get_health, and get_timeline
from PR #74's shape to pi-agent-core's AgentTool with @sinclair/typebox
schemas. Locks the pattern before batching the remaining tools.

Tools consume the existing ApiClient directly — no AgentServices shim.
Aero uses the same API surface as any external agent, keeping the
agent-first contract.

Project name is bound via the ToolContext closure, not an LLM-visible
argument — prevents the model from targeting the wrong project.

- packages/canonry/src/agent/tools.ts — ToolContext, buildReadTools
- packages/canonry/test/agent-tools.test.ts — 6 tests covering tool
  construction, default params, override params, and filter behavior
- Bump to 2.0.2

* feat(agent): canonry agent ask — one-shot CLI backed by pi-agent-core

Stack 3 of the native agent loop: wires a full Aero session and exposes
it as `canonry agent ask <project> "<prompt>"`. First dogfoodable slice
of the native loop.

Session module (agent/session.ts) composes:
  - System prompt loaded from skills/aero/SKILL.md (bundled asset or
    repo-root fallback)
  - Pi-ai model resolution (default anthropic/claude-opus-4-7; falls back
    through openai and google based on which canonry API key is present)
  - Read tools from the stack 2 port (get_status, get_health, get_timeline)
  - getApiKey resolver that maps pi-ai provider names to the canonry
    config keys (anthropic→claude, google→gemini)

CLI command (commands/agent-ask.ts) subscribes to AgentEvents and prints
them — tool calls, tool results, assistant text. Supports --provider,
--model, --format json.

Integration tests use pi-ai's faux provider to exercise the full
prompt→events→idle lifecycle without hitting a real LLM. 7 tests cover
prompt loading, provider detection, end-to-end event sequence, and the
no-provider-configured error path.

- Bump to 2.1.0 — first shippable agent feature.

* feat(agent): add z.ai (GLM) provider + env-var API key fallback

Wires pi-ai's built-in `zai` provider into session.ts as a fourth
SupportedAgentProvider option, with glm-5.1 as the default model.
detectAgentProvider now considers zai alongside anthropic, openai,
and google.

Extends buildApiKeyResolver and detectAgentProvider with a
pi-ai getEnvApiKey() fallback — if no canonry config entry is
present, pulls from ANTHROPIC_API_KEY / OPENAI_API_KEY /
GEMINI_API_KEY / ZAI_API_KEY. Removes the need to persist
ephemeral keys to ~/.canonry/config.yaml when dogfooding.

CLI: `canonry agent ask --provider zai` is now valid.

- Bump to 2.1.1

* feat(agent): batch-port remaining read tools (get_insights, list_keywords, list_competitors, get_run)

Stack 4: brings the Aero read surface from 3 tools to 7.

- get_insights — intelligence engine output (regressions/gains/opportunities
  with cause + recommendation metadata). Agents should query this instead
  of re-deriving conclusions from raw timeline rows.
- list_keywords / list_competitors — tracking scope.
- get_run — drill into a specific run by id. Particularly useful after
  get_status surfaces a failed run.

Dropped get_evidence from the original PR #74 list — canonry's evidence
command is just getTimeline() with a "cited" boolean convenience, so it
would be redundant against the existing get_timeline tool.

- Bump to 2.2.0

* feat(agent): write tools — run_sweep, dismiss_insight, add_keywords, add_competitors, update_schedule, attach_agent_webhook

Stack 5: gives Aero the ability to act, not just analyze. The agent now
closes the "want me to kick that off?" loop that the previous stacks
ended on — run_sweep actually triggers the sweep, attach_agent_webhook
wires external agents, and so on.

Six additive-only write tools (no destructive surface yet — Aero can
recommend removals in prose, not enact them):

- run_sweep — POST /projects/:name/runs, optional provider filter
- dismiss_insight — POST /intelligence/insights/:id/dismiss
- add_keywords — POST /projects/:name/keywords (append semantic)
- add_competitors — read + merge-dedup + PUT /competitors
- update_schedule — PUT /projects/:name/schedule with cron xor preset
- attach_agent_webhook — idempotent notification create, source='agent'

Write tool calls surface via tool_execution_start events so the user
sees exactly what fired. No confirmation gating in this stack — opt-in
by running `canonry agent ask`. Confirmation policy lands alongside
the UI stack when a chat surface exists to ask through.

buildAllTools(ctx) combines reads + writes (13 tools total); session.ts
now defaults to the full set. Callers can narrow to reads-only via the
`tools` override.

- Bump to 2.3.0

* feat(agent): persistent session registry — agent_sessions table + SessionRegistry

Stack 6a of proactive Aero: the hybrid persistence layer underneath
RunCoordinator → agent.followUp(). Live pi-agent-core Agent instances
stay in memory per project; the durable state (transcript + queued
follow-up messages + chosen provider/model) lives in agent_sessions.

Schema:
- agent_sessions table — one row per project (UNIQUE on project_id).
  Columns: system_prompt, model_provider, model_id, messages JSON,
  follow_up_queue JSON, created_at, updated_at.
- Migration v38 added to packages/db.

SessionRegistry API:
- getOrCreate(projectName) — returns cached live Agent, hydrates from
  DB if persisted (draining follow_up_queue into the live followUp
  queue), or constructs + inserts a fresh row.
- save(projectName) — persists state.messages back to the row.
- queueFollowUp(projectName, msg) — forwards to live agent if cached;
  otherwise appends to the DB row's queue; buffers pre-session messages
  until the first getOrCreate creates the row.
- evict(projectName) / clear() — drop live Agent(s); durable state
  untouched.

No behavior change for `canonry agent ask` yet — the CLI still uses the
per-invocation createAeroSession path. Stack 6b wires RunCoordinator
to the registry so run.completed / insight.* events drive followUp;
stack 6c drains queued follow-ups unprompted (the actual proactive
moment).

- docs/data-model.md gets an Agent section.
- Tests cover insert, live hot path, rehydration-with-queue-drain,
  idle queue persistence, and the pre-session buffer.
- Bump to 2.3.1

* feat(agent): proactive Aero — registry-driven CLI + RunCoordinator wake-up

Completes stack 6: Aero now wakes up unprompted when runs complete,
and CLI conversations thread across invocations via the persistent
session registry.

SessionRegistry refactor:
- Drops reliance on pi's internal follow-up queue. The registry owns
  the pending buffer directly; drainNow / the next user prompt
  consumes and forwards via agent.prompt().
- queueFollowUp routes by session liveness (live → pending Map; idle
  → persisted queue).
- drainNow (async, fire-and-forget safe) hydrates if needed, consumes
  pending, prompts the Agent, saves transcript back.
- consumePending exposes the drain primitive so the CLI can bundle
  queued events into the next user prompt in a single turn.

CLI switch (commands/agent-ask.ts):
- Opens a DB connection + migrates, constructs SessionRegistry per
  invocation, getOrCreate the session, bundles any pending events in
  front of the user's prompt, runs, then saves.
- Conversations now persist across `canonry agent ask` invocations —
  the next call sees prior transcript and any queued follow-ups.

RunCoordinator wiring:
- New OnAeroEvent callback added as a third subscriber (after
  intelligence + notifier). Receives runId / projectId / insight
  counts; returns Promise<void>.
- server.ts constructs SessionRegistry + ApiClient directly from the
  loaded config (not loadConfig, to avoid test-ordering issues) and
  passes a callback that enqueues a "[system] Run X completed…"
  message and fires drainNow.
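The wiring above can be sketched as follows (the callback shape of runId / projectId / insight counts is taken from this description; the message text and helper names are illustrative):

```typescript
// Hypothetical sketch of the OnAeroEvent subscriber described in this commit.
type AeroEvent = { runId: string; projectId: string; insightCount: number };

type RegistryLike = {
  queueFollowUp(project: string, msg: string): void;
  drainNow(project: string): Promise<void>;
};

function makeOnAeroEvent(registry: RegistryLike): (e: AeroEvent) => Promise<void> {
  return async (e) => {
    // Enqueue a system message, then fire the proactive wake-up.
    registry.queueFollowUp(
      e.projectId,
      `[system] Run ${e.runId} completed with ${e.insightCount} insights.`,
    );
    await registry.drainNow(e.projectId);
  };
}
```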

Tests: registry suite refactored to match the new pending-buffer model
(10 cases). Full workspace: 1002 tests green.

- Bump to 2.4.0 — first stack where Aero acts without being prompted.

* fix(agent): prevent duplicate follow-up + drain pending after each turn

Two bugs surfaced by the first live proactive wake-up:

1. queueFollowUp wrote to BOTH the in-memory pending Map AND the DB
   follow_up_queue when no live session existed. Then drainNow's
   getOrCreate hydrated the DB queue INTO pending, producing a second
   copy. First end-to-end test showed the [system] "Run X completed"
   message appearing twice in the transcript.

   Fix: queueFollowUp now writes to exactly one sink — pending when
   live, DB when idle. Hydration is the only path that moves DB →
   pending.

2. No drain trigger between turns. If RunCoordinator fired while a
   CLI session was mid-turn, the queued message landed in pending but
   nothing drained it until someone called drainNow explicitly.

   Fix: getOrCreate now subscribes to agent_end on the live Agent and
   fires drainNow() whenever pending has items. This is re-entrant-safe:
   the drain calls prompt(), which itself emits a new agent_end, so we
   stay single-threaded via pi's internal guards.

Added a regression test covering the duplicate scenario (evict →
queueFollowUp while idle → drainNow should yield exactly one copy of
the queued message in the transcript).

- Bump to 2.4.1

* feat(agent): SSE agent routes — transcript GET/DELETE + prompt stream

Stack 7a of the dashboard UI: thin Fastify routes that wrap the
SessionRegistry for the browser. Same abort/save lifecycle as the CLI
path; envelope shape differs only by the stream_open / stream_close
control frames that let the client distinguish a clean close from a
network drop.

Routes (all under `${apiPrefix}/projects/:name/agent/`):
  GET    .../transcript — current rolling messages + model config
  DELETE .../transcript — reset the conversation
  POST   .../prompt     — { prompt, provider?, modelId? } → SSE of
                          AgentEvent JSON lines

SSE envelope: each frame is `data: <JSON>\n\n`. Pi-ai AgentEvents pass
through verbatim; stream_open + stream_close bracket the conversation;
error frames surface prompt failures without collapsing the stream.
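That framing can be sketched as an encoder plus an incremental parser (control-frame names per this commit; the real route writes frames to the raw reply stream):

```typescript
// Sketch of the SSE envelope described above: each frame is `data: <JSON>\n\n`.
type Frame = { type: string; [k: string]: unknown };

function encodeFrame(frame: Frame): string {
  return `data: ${JSON.stringify(frame)}\n\n`;
}

// Incremental parser: feed chunks as they arrive, get back complete frames.
function makeFrameParser(): (chunk: string) => Frame[] {
  let buffer = "";
  return (chunk) => {
    buffer += chunk;
    const frames: Frame[] = [];
    let idx: number;
    while ((idx = buffer.indexOf("\n\n")) !== -1) {
      const raw = buffer.slice(0, idx);
      buffer = buffer.slice(idx + 2);
      if (raw.startsWith("data: ")) frames.push(JSON.parse(raw.slice(6)) as Frame);
    }
    return frames;
  };
}
```

Buffering until the blank-line delimiter is what lets the client cope with JSON frames split across network chunks.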

Client disconnect aborts the live Agent (`agent.abort()`), so navigating
away mid-turn stops the LLM call instead of burning tokens to /dev/null.

Registered in server.ts before apiRoutes so the shared base prefix +
session registry are already in scope. Not wired from api-routes —
Aero stays canonry-local until the cloud API explicitly opts in.

- 1003 tests green
- Bump to 2.5.0

* feat(web): Aero bottom command bar — dashboard surface for the native loop

Stack 7b of the native agent loop: the browser-facing UI for Aero.

Design (per project_aero_ui_direction memory): a fixed bottom bar,
not a chat panel. Collapsed state shows "Ask Aero about <project>…";
clicking expands upward into a composer + rolling transcript. Only
renders on project-scoped routes — hidden on overview, settings, setup,
since there's no project context to ask about.

Components:
- apps/web/src/api-aero.ts — typed AeroEvent / AeroMessage shape,
  fetchAeroTranscript / resetAeroTranscript / promptAero. promptAero
  parses the SSE framing (data: JSON\n\n) and fires onEvent per frame;
  the caller passes an AbortSignal that translates to canceling the
  underlying fetch (server aborts the run on disconnect).
- apps/web/src/components/shared/AeroBar.tsx — bar + AeroBarHost.
  Host reads router location and shows the bar iff the path matches
  /projects/<name>. Starter buttons (Status / Top insights / Last
  failed run / Schedule) fire canned prompts for zero-friction first
  use. Tool calls stream inline as emerald pills. [system] follow-up
  messages from RunCoordinator are filtered out of the transcript —
  they're internal plumbing, not user-facing.
- Mounted in App.tsx RootLayout alongside Toaster.

The in-turn streaming text is mirrored from message_update events so
users see tokens as they arrive. On message_end the streaming buffer
clears and the final message slots into the transcript via a fresh
transcript fetch (resyncs in case of events that landed post-end).

- 1003 tests green
- Bump to 2.6.0

* docs(agent): sync AGENTS.md + README + canonry-cli skill for native Aero

Stack 8 of the native agent loop: bring the user- and agent-facing docs
into line with the shipped surface. The previous pass (commit 573cb96
during the OpenClaw rip-out) described a stripped-down, webhook-only
world; that's no longer accurate — we've shipped the full Aero agent
on pi-agent-core with 13 tools, proactive wake-ups, and a dashboard
bar. Docs catch up.

- AGENTS.md (root): Agent Layer section rewritten. CLI ref updated with
  `agent ask`. Key files listed (session.ts, session-registry.ts,
  tools.ts, agent-routes.ts, AeroBar.tsx). External-agent webhook path
  kept as a separate subsection for BYO-agent users.
- packages/canonry/AGENTS.md: Key Files table now lists the five new
  agent-module files. Agent layer section split into built-in / external.
- README.md: "Talking to Aero" section for the CLI; "Bringing your own
  agent" for webhooks. First bullet of Features promotes the built-in
  agent. Intro paragraph mentions Aero + pi-agent-core by name.
- skills/canonry-setup/references/canonry-cli.md: ask command with
  provider table + env-var fallback order. Persistence behavior
  documented.

Docs-only — no version bump.

* fix(agent): move prompt-stream abort listener from request to response side

Symptom: the SSE prompt endpoint produced assistant messages with
empty content and stopReason=aborted. No LLM output surfaced in the
dashboard bar.

Root cause: `request.raw.on('close', ...)` fires as soon as the client
finishes uploading the POST body — normal for every POST — not when
the client disconnects from the response stream. So the abort handler
was firing immediately after the prompt arrived, canceling the agent
before pi-ai even started the LLM request.

Fix: listen on `reply.raw.on('close')`. The response-side socket
close fires only when the client actually drops the connection mid-
stream, which is the signal we want.

Verified via `curl -X POST … /agent/prompt`: full event sequence now
fires (tool_execution_start/end, final message_end with real
assistant text, clean stream_close).

- Bump to 2.6.1

* fix(agent): code-review fixes — auth, listener leaks, reader abort

Addresses findings from the pre-PR code review:

BLOCKER: The /api/v1/projects/:name/agent/* routes were registered on
the outer Fastify instance, bypassing the authPlugin that's scoped
inside apiRoutes' encapsulated plugin. Anyone reaching the port could
read transcripts, reset them, or drive Aero with the operator's LLM key.

Fix: api-routes now exposes a `registerAuthenticatedRoutes` hook that
runs inside the authenticated scope. Canonry passes `registerAgentRoutes`
through this hook so Aero shares the bearer-key / session-cookie auth.
Verified via `curl` — 401 without auth, 200 with the api key.

Other fixes:
- agent-routes.ts: `reply.raw.once('close')` instead of `.on('close')` so
  the listener (and its closure over `agent`) is released after it fires
  instead of being retained for the socket's lifetime.
- session-registry.ts: `drainNow` now leaves pending messages in the queue
  when `isStreaming` is true (the agent_end drain hook will pick them up).
  Prevents event loss if two run.completed events land back-to-back.
- commands/agent-ask.ts: sets `process.exitCode = 2` when a stream emits
  an assistant message with `stopReason === 'error'` or an errorMessage.
  Agents scripting against the CLI now get a non-zero exit on silent
  provider failures instead of a false success.
- api-aero.ts: wire the caller's AbortSignal to `reader.cancel()` so
  aborting a prompt mid-stream unblocks `reader.read()` immediately.
- AeroBar.tsx: use a stable key (`role:timestamp:index`) for message
  rows so React doesn't churn when the transcript re-fetch returns.
- Deleted unused `src/agent/pi-runtime.ts` + its test — callers use
  `@mariozechner/pi-agent-core` directly.

Verified live:
- auth: 401 vs 200 on agent routes
- full SSE turn: token-by-token streaming through message_update, clean
  stream_close, final assistant text

- 1000 tests green
- Bump to 2.6.2

* fix(agent): staff-review pass — CLI→HTTP, scope gate, 409 on concurrent, model override, proactive polling

Addresses the five P1/P2 findings from the pre-merge review.

P1-1: canonry agent ask was running its own local DB + SessionRegistry
which broke against remote/shared canonry servers. Now the CLI posts
to `/api/v1/projects/:name/agent/prompt` with `scope: 'all'` and
parses the SSE stream — same session store as the dashboard, one
execution path. Auth via the bearer api key; SIGINT cancels the
in-flight fetch.

P1-2: Two overlapping `/agent/prompt` requests on the same project
shared the same per-project Agent instance — tool/message events
cross-streamed, and either client closing could abort the other's
run. Added a `state.isStreaming` guard that returns `409 AGENT_BUSY`
with the new `agentBusy()` error factory. Verified: two parallel
curls → first=200, second=409.

P1-3: Dashboard sessions were getting the full read+write toolset
with no confirmation UX, so a free-form prompt could trigger sweeps
or mutate schedules from the command bar. Split `toolScope`:
- Dashboard `/agent/prompt` defaults to `read-only` (7 tools).
- CLI passes `scope: 'all'` to keep write tools available (13 tools).
Tools swap per-request on the cached Agent; safe because we now 409
concurrent requests (the Agent is idle when tools swap).

P2-4: `provider` / `modelId` flags were silently ignored once a
session row existed — `getOrCreate` always rehydrated with the
persisted model. Now explicit preferences override the persisted
values AND are persisted back, so subsequent invocations use the new
model unless the caller specifies otherwise.
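The override rule can be sketched as a pure resolution step (names hypothetical; the real code would also re-default the model id when only the provider changes, which this sketch skips):

```typescript
// Sketch of the P2-4 rule: explicit caller preferences beat the persisted row,
// and a changed choice is flagged for write-back to agent_sessions.
type ModelChoice = { provider: string; modelId: string };

function resolveModel(
  persisted: ModelChoice,
  explicit: Partial<ModelChoice>,
): { chosen: ModelChoice; persistBack: boolean } {
  const chosen: ModelChoice = {
    provider: explicit.provider ?? persisted.provider,
    modelId: explicit.modelId ?? persisted.modelId,
  };
  const persistBack =
    chosen.provider !== persisted.provider || chosen.modelId !== persisted.modelId;
  return { chosen, persistBack };
}
```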

P2-5: Proactive turns from RunCoordinator wake-ups were invisible in
the dashboard because the bar only fetched transcript on open or
after a user prompt. Added a 15-second poll while the bar is open
and no prompt is in flight. Server-initiated turns now surface
within a poll cycle.

Addressed open questions:
- CLI transcript parity: added `canonry agent transcript <project>`
  and `canonry agent reset <project>` subcommands (GET + DELETE
  transcript). Previously only the dashboard could read/reset the
  transcript; agents scripting the CLI had no equivalent.
- OpenAPI coverage: new `agent/*` endpoints documented via an opt-in
  `canonryLocalRouteCatalog` that only activates when the caller
  passes `includeCanonryLocal: true` (canonry does; shared api-routes
  doesn't, so the strict contract test still passes).

- New error code: `AGENT_BUSY` (409) in packages/contracts/src/errors.ts
- Added `scope?: 'all' | 'read-only'` to the prompt request body

- 1000 tests green
- Bump to 2.7.0 (minor: two new CLI commands, new error code,
  behavior change on dashboard tool scope)

* chore: hold version at 2.0.0 — native-agent-loop ships as a single release

Reverts the 2.0.1 → 2.7.0 churn introduced across the branch. Upstream
main is still on 1.x; this entire feature will land as 2.0.0.

* refactor(agent): consolidate provider/model maps into a single registry

The three hand-written parallel maps — `SupportedAgentProvider` union,
`DEFAULT_MODEL_IDS`, `CANONRY_PROVIDER_KEY` — plus the auto-detect
priority list and the CLI's separate `AGENT_PROVIDERS` validation array
were five places a new provider had to be added. And nothing tied them
together, so a typo or missing entry was a silent bug at runtime.

New shape in `packages/canonry/src/agent/providers.ts`:

  AGENT_PROVIDERS = {
    anthropic: { piAiProvider, label, canonryConfigKey, defaultModel, autoDetectPriority },
    openai: {...},
    google: {...},
    zai: {...},
  } as const satisfies Record<string, AgentProviderEntry>

Everything downstream is derived:
- `SupportedAgentProvider` is `keyof typeof AGENT_PROVIDERS`.
- `AgentProviders` is the canonical enum constant (like RunKinds).
- `listAgentProviders`, `agentProvidersByPriority`, `getAgentProvider`,
  `coerceAgentProvider`, `findByPiAiProvider`, `resolveApiKeyFor`,
  `resolveModelForProvider` all read from the single table.
- `validateAgentProviderRegistry()` runs at the first session construction
  and throws if any default model is missing from the installed pi-ai
  catalog — surfaces registry drift early instead of at a user's first
  prompt.

Adding a new provider (say Mistral or Bedrock) is now one row in the
registry. CLI validation, auto-detect priority, env-var resolution,
and model defaulting all update without further edits.
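A condensed sketch of the pattern (entries trimmed to two rows; the real rows also carry piAiProvider and canonryConfigKey, and the model ids here are placeholders):

```typescript
// Single source of truth; everything else is derived from this table.
type AgentProviderEntry = {
  label: string;
  defaultModel: string;
  autoDetectPriority: number;
};

const AGENT_PROVIDERS = {
  anthropic: { label: "Claude", defaultModel: "claude-model", autoDetectPriority: 1 },
  openai: { label: "OpenAI", defaultModel: "openai-model", autoDetectPriority: 2 },
} as const satisfies Record<string, AgentProviderEntry>;

// The union type is derived, so a new row automatically widens it.
type SupportedAgentProvider = keyof typeof AGENT_PROVIDERS;

function listAgentProviders(): SupportedAgentProvider[] {
  return Object.keys(AGENT_PROVIDERS) as SupportedAgentProvider[];
}

function agentProvidersByPriority(): SupportedAgentProvider[] {
  return listAgentProviders().sort(
    (a, b) => AGENT_PROVIDERS[a].autoDetectPriority - AGENT_PROVIDERS[b].autoDetectPriority,
  );
}

function coerceAgentProvider(value: string): SupportedAgentProvider {
  if (!(value in AGENT_PROVIDERS)) {
    throw new Error(`unknown provider: ${value}; known: ${listAgentProviders().join(", ")}`);
  }
  return value as SupportedAgentProvider;
}
```

`as const satisfies` keeps the literal key types (for the derived union) while still type-checking every row against AgentProviderEntry, so a missing field fails at compile time rather than at a user's first prompt.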

Call sites updated:
- session.ts uses the registry helpers; dropped the hand-rolled maps.
- cli-commands/agent.ts uses `coerceAgentProvider` + `listAgentProviders`
  instead of maintaining its own parallel array.

12 new tests in agent-providers.test.ts cover registry invariants
(unique priorities, every default resolves against pi-ai, every row
has the required fields, coercion behavior, apiKey resolution).

- 1012 tests green (was 1000, +12 for the registry suite)

* fix(agent): reviewer pass 3 — acquireForTurn guard, CLI via ApiClient+CliError, hot-session model swap

Addresses the P1/P2 findings from the staff re-review.

P1-1: Busy-check now runs BEFORE any cached-Agent mutation.

Introduced `SessionRegistry.acquireForTurn(name, prefs)`. It:
  1. getOrCreate (pure — never mutates a cached Agent).
  2. Throws `AGENT_BUSY` (409) if `state.isStreaming` is true.
  3. Only after the busy guard passes does it align tool scope and
     optionally swap the model.

Previously `getOrCreate` eagerly re-scoped tools on every lookup and
THEN the route checked busy — so a dashboard read-only request could
swap tools out from under an in-flight CLI `scope: 'all'` turn before
getting its 409. Agent-routes now calls `acquireForTurn`; drainNow
does too (catches AGENT_BUSY and leaves pending for the agent_end
hook). Added a regression test that asserts `acquireForTurn` throws
without mutating `state.tools` when the Agent is streaming.
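The guard ordering can be sketched as follows (state shape and names assumed from this description):

```typescript
// Sketch of the acquireForTurn invariant: the busy check runs before any
// mutation of the cached agent state.
type AgentState = { isStreaming: boolean; toolScope: "all" | "read-only" };

class AgentBusyError extends Error {
  readonly statusCode = 409;
}

function acquireForTurn(
  state: AgentState,
  prefs: { toolScope?: "all" | "read-only" },
): AgentState {
  // 1. Guard first: a streaming agent must not be mutated by a competing request.
  if (state.isStreaming) throw new AgentBusyError("AGENT_BUSY");
  // 2. Only after the guard passes is it safe to align the tool scope.
  if (prefs.toolScope && prefs.toolScope !== state.toolScope) {
    state.toolScope = prefs.toolScope;
  }
  return state;
}
```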

P1-2: CLI goes through `createApiClient()` + `CliError`.

`agent-ask.ts` and `agent-transcript.ts` no longer build raw URLs
and bypass ApiClient. Added `ApiClient.streamPost(path, body, signal)`
that shares the existing probe + auth + structured-error path, returning
a Response whose body the caller streams. Added
`ApiClient.getAgentTranscript()` / `resetAgentTranscript()` for the
transcript + reset subcommands. Errors now surface via `printCliError`
with a proper `CliError.exitCode`, matching the repo's 0/1/2 contract.

The ApiClient `/health` probe also re-engages for these calls, so
reverse-proxied deployments without a local basePath config resolve
correctly (previously we'd 404).

P2-1: `--provider` / `--model` now affect hot cached sessions too.

`acquireForTurn` aligns `state.model` on the cached Agent (not just
on DB rehydration) when preferences change the provider or model id,
and persists the new choice back to the `agent_sessions` row.
Regression test verifies a cached agent's `state.model` changes when
`provider: 'zai'` is passed to a session that was constructed with
`provider: 'anthropic'`.

P2-2: Version manifests intentionally reverted to 2.0.0.

Earlier review reply mentioned 2.7.0; that was per-commit churn the
repo owner asked me to roll back. The feature ships as a single 2.0.0
release — upstream main is still on 1.x.

- +3 tests → 1015 workspace tests green

* feat(web): Aero bar expand-to-fullscreen toggle

Adds Maximize/Minimize icons in the bar header. Clicking promotes the
bar to a near-fullscreen overlay (max-w-5xl, backdrop-blurred, textarea
grows to 3 rows). Clicking again snaps back to the compact bottom bar.

Escape key collapses expanded → compact → closed in sequence. Clicking
the backdrop in expanded mode collapses back to compact (but doesn't
close the session — the transcript persists).

No logic changes. Pure presentation state. AeroBar still renders nothing
outside project routes.

* feat(web): Aero bar — typing indicator + rendered markdown

Two UX fixes from the live dashboard pass.

Typing indicator: three pulsing emerald dots labeled "Aero" appear in
the transcript whenever the session is streaming but hasn't emitted
any assistant text or tool pills yet. Covers:
  - the pre-first-token "thinking" moment after a user prompt,
  - the post-tool-result "analyzing" moment between tool rounds.
Hides as soon as streaming text arrives or a tool-execution pill takes
the spotlight. Respects prefers-reduced-motion.

Markdown rendering: replaces the raw-text transcript with
react-markdown. Headings, tables, lists, bold/italic, inline + block
code, blockquotes, hr, and links all render with tailwind overrides
that match the zinc/emerald dashboard palette — no browser-default
blue underlines or black text. Links open in a new tab with
noopener/noreferrer. Both the finalized assistant messages and the
in-flight streamingText go through the same renderer, so tokens
arrive formatted rather than as raw asterisks.

- new dep: react-markdown in apps/web
- new CSS: .aero-dot keyframes in styles.css

* feat(agent): provider switch, tool trails, slash palette, copy-as-CLI in AeroBar

Adds four dashboard agent-surface features so Aero feels like a real agentic
console instead of a chat box:

- Provider picker in the AeroBar header with per-turn override. New
  GET /projects/:name/agent/providers endpoint returns the full registry
  (provider id, model id, keySource, defaultProvider), backed by the new
  AgentProvidersResponse DTO in @ainyc/canonry-contracts.
- Inline tool trails render each tool call as a collapsible card with
  running/ok/failed state, duration, and expandable args + result JSON —
  pulled from the SSE tool_execution_* frames.
- Slash-command palette: `/` in the composer opens a Raycast-style menu
  of 8 curated prompts (status, insights, last-run, last-failed,
  run-sweep, schedule, keywords, competitors) with live filter and
  Arrow/Tab/Enter/Escape keybindings.
- "Copy as CLI" on hover of user messages writes `canonry agent ask
  <project> "<prompt>"` to the clipboard with POSIX-safe quoting,
  honoring the agent-first CLI/API parity principle.
- Context pills above the composer surface project/model/scope. A scope
  chip toggles read-only ↔ all tools; the all-tools state is amber to
  signal elevated access. The scope flows through promptAero → the
  prompt endpoint → SessionRegistry.acquireForTurn so the pi-agent-core
  tool surface is filtered per-turn.
- 91 lines of tests covering provider registry, resolveApiKeyFor/Source,
  and buildAgentProvidersResponse under config vs env key sourcing.
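The POSIX-safe quoting behind Copy-as-CLI can be sketched as (helper names hypothetical):

```typescript
// Single-quote wrapping with the standard '\'' escape for embedded quotes,
// so arbitrary prompt text pastes safely into a POSIX shell.
function posixQuote(s: string): string {
  return `'${s.replace(/'/g, "'\\''")}'`;
}

function copyAsCli(project: string, prompt: string): string {
  return `canonry agent ask ${posixQuote(project)} ${posixQuote(prompt)}`;
}
```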

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(agent): reviewer pass 4 — scope parity, providers CLI, OpenAPI

Addresses three P2 findings against a18bb82:

* Copy-as-CLI now threads the current UI scope through `canonry agent
  ask --scope`, so a pasted read-only turn can't quietly upgrade to
  write-capable. CLI default stays `all`; the flag is omitted when the
  UI is in `all` mode to keep pastes terse. New vitest covers the three
  paths.
* `canonry agent providers <project>` — CLI parity for the dashboard
  provider picker, with `--format json` and the same
  `AgentProvidersResponse` shape the UI consumes.
* `/api/v1/projects/{name}/agent/providers` is now listed in the
  canonry-local OpenAPI catalog (the route existed but wasn't
  documented).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(agent): review pass 5 — scope-preserving drain, ER diagram, --scope skill docs

- Preserve the session's current tool scope during proactive drains in
  SessionRegistry.drainNow; fail-closed to 'read-only' when no scope is
  cached. Prevents a run.completed-triggered follow-up from silently
  escalating a read-only dashboard session to the full 13-tool write
  surface.
- Add agent_sessions to the docs/data-model.md ER diagram (prose half was
  already present).
- Document --scope all|read-only on canonry agent ask in the
  canonry-setup skill reference, including the dashboard's read-only
  default rationale.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore(canonry): refresh bundled SPA asset hash

Rebuild of the bundled dashboard produced by the local test run.
index.html now references the new hashed bundle name.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(agent): soul.md grounding + progressive-disclosure skill docs

Compose Aero's system prompt from two files: skills/aero/soul.md
(identity/values/voice/boundaries) + skills/aero/SKILL.md (task rules).
Soul is prepended so identity frames judgment.

Add two skill-doc tools for progressive disclosure of bundled reference
playbooks — SKILL.md stays lightweight, playbooks load on demand:
- list_skill_docs — scans references/*.md, parses description frontmatter
- read_skill_doc({ slug }) — validates slug against manifest, returns content
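A sketch of the lookup, assuming simple `description:` frontmatter and an in-memory manifest standing in for the references/*.md scan:

```typescript
// Parse a `description` field out of minimal frontmatter (format assumed).
function parseDescription(markdown: string): string | undefined {
  const m = markdown.match(/^---\n([\s\S]*?)\n---/);
  if (!m) return undefined;
  const line = m[1].split("\n").find((l) => l.startsWith("description:"));
  return line?.slice("description:".length).trim();
}

// Slugs validate against the manifest, so the tool never reads arbitrary paths.
function readSkillDoc(manifest: Record<string, string>, slug: string): string {
  const doc = manifest[slug];
  if (doc === undefined) throw new Error(`unknown skill doc: ${slug}`);
  return doc;
}
```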

Skill-doc tools ride in every scope (read-only and all). Registry's
alignScope now preserves them across scope realignment.

Consolidate soul to a single source — delete the duplicate
assets/agent-workspace/SOUL.md workspace-root copy. Built-in agent and
external-agent workspace both reach the same skills/aero/soul.md via the
copy-agent-assets build step.

Version 2.0.2 → 2.1.0 (new tool surface).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(agent): align Aero provider IDs with sweep (anthropic→claude, google→gemini)

Aero identified LLM backends as anthropic/openai/google/zai while the sweep
side uses claude/gemini/openai. Operators saw two vocabularies for the same
concept. Standardize on the sweep naming and expose a canonical ProviderIds
enum in @ainyc/canonry-contracts that both surfaces reference.

- New contracts/src/providers.ts: ProviderIds + AgentProviderIds + SweepProviderIds
- Rename AGENT_PROVIDERS keys; drop canonryConfigKey (id === config key now)
- DB migration v39 rewrites existing agent_sessions.modelProvider values
- CLI --provider anthropic/google now rejected; use claude/gemini

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(api): source agent provider enum from canonical AGENT_PROVIDER_IDS

The /agent/prompt OpenAPI spec still advertised the old
['anthropic','openai','google','zai'] enum after the rename to
['claude','openai','gemini','zai'], so spec-driven clients would send
stale values and crash resolveModelForProvider. Importing
AGENT_PROVIDER_IDS from @ainyc/canonry-contracts keeps the spec and
the runtime validator in lock-step across future renames.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(agent): proactive drain on cold sessions + authoritative transcript reset

Two reviewer findings on the native Aero loop:

1. drainNow returned early when the in-memory pending map was empty, so
   queueFollowUp → drainNow on a cold / post-restart session never woke
   the agent: the follow-up sat in the DB queue until a manual prompt
   hydrated the session. drainNow now checks both in-memory pending and
   the persisted follow_up_queue via hasPendingWork(); acquireForTurn →
   getOrCreate handles the hydration + DB→pending migration.

2. DELETE /agent/transcript wiped the DB row but only called evict(),
   leaving the in-memory pending buffer and scope cache intact. A system
   follow-up queued on a hot session could leak into the next prompt
   after a reset. New SessionRegistry.reset() clears live agent +
   pending + scopes; the route uses it in place of evict().

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(server): inject <base href> unconditionally so SPA deep-links work

Without an explicit basePath, the built index.html's relative
`./assets/...` paths resolved against the current URL — visiting
`/projects/:name` directly fetched `/projects/assets/index-*.js`,
hit the SPA fallback, and received HTML where the browser expected
JS, so React never mounted. Always emit `<base href="${basePath ?? '/'}">`.
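As a pure string transform the fix looks like this (the real server rewrites the built index.html at serve time; this sketch assumes a literal `<head>` tag is present):

```typescript
// Always emit a <base href>, defaulting to "/" when no basePath is configured,
// so relative ./assets/... URLs resolve against the app root on deep links.
function injectBaseHref(html: string, basePath?: string): string {
  const href = basePath ?? "/";
  return html.replace("<head>", `<head><base href="${href}">`);
}
```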

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: revert version bump, keep at 2.0.0

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>